A Computationally Efficient Algorithm for Learning Topical Collocation Models
Abstract
Most existing topic models make the bag-of-words assumption that words are generated independently, and so ignore potentially useful information about word order. Previous attempts to use collocations (short sequences of adjacent words) in topic models have either relied on a pipeline approach, restricted attention to bigrams, or resulted in models whose inference does not scale to large corpora. This paper studies how to simultaneously learn both collocations and their topic assignments. We present an efficient reformulation of the Adaptor Grammar-based topical collocation model (AG-colloc) (Johnson, 2010), and develop a point-wise sampling algorithm for posterior inference in this new formulation. We further improve the efficiency of the sampling algorithm by exploiting sparsity and parallelising inference. Experimental results on text classification, information retrieval and human evaluation tasks across a range of datasets show that this reformulation scales to hundreds of thousands of documents while maintaining the good performance of the AG-colloc model.
Publication date: 2015